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Abstract 

The ensemble Kalman filter (EnKF) is a recursive filter suitable for problems with a large 
number of variables, such as discretizations of partial differential equations in geophysical 
models. The EnKF originated as a version of the Kalman filter for large problems (essentially, 
the covariance matrix is replaced by the sample covariance), and it is now an important data 
assimilation component of ensemble forecasting. EnKF is related to the particle filter (in 
this context, a particle is the same thing as an ensemble member) but the EnKF makes the 
assumption that all probability distributions involved are Gaussian. This article briefly describes 
the derivation and practical implementation of the basic version of EnKF, and reviews several 
extensions. 

1 Introduction 

The Ensemble Kalman Filter (EnKF) is a Monte-Carlo implementation of the Bayesian update 
problem: Given a probability density function (pdf) of the state of the modeled system (the prior, 
called often the forecast in geosciences) and the data likelihood, the Bayes theorem is used to to 
obtain pdf after the data likelihood has beed taken into account (the posterior, often called the 
analysis). This is called a Bayesian update. The Bayesian update is combined with advancing 
the model in time, incorporating new data from time to time. The original Kalman Filter [18] 
assumes that all pdfs are Gaussian (the Gaussian assumption) and provides algebraic formulas for 
the change of the mean and covariance by the Bayesian update, as well as a formula for advancing 
the covariance matrix in time provided the system is linear. However, maintaining the covariance 
matrix is not feasible computationally for high-dimensional systems. For this reason, EnKFs were 
developed [__, [16]. EnKFs represent the distribution of the system state using a random sample, 
called an ensemble, and replace the covariance matrix by the sample covariance computed from 
the ensemble. One advantage of EnKFs is that advancing the pdf in time is achieved by simply 
advancing each member of the ensemble. For a survey of EnKF and related data assimilation 
techniques, see [12) . 
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2 A derivation of the EnKF 



2.1 The Kalman Filter 



Let us review first the Kalman filter. Let x denote the n-dimensional state vector of a model, and 
assume that it has Gaussian probability distribution with mean \x and covariance Q, i.e., its pdf is 



p(x) oc exp ^-^(x - fi) T Q l (x-fJ,)\ 



Here and below, oc means proportional; a pdf is always scaled so that its integral over the whole 
space is one. This probability distribution, called the prior, was evolved in time by running the 
model and now is to be updated to account for new data. It is natural to assume that the error 
distribution of the data is known; data have to come with an error estimate, otherwise they are 
meaningless. Here, the data d is assumed to have Gaussian pdf with covariance R and mean Hx, 
where H is the so-called the observation matrix. The covariance matrix R describes the estimate 
of the error of the data; if the random errors in the entries of the data vector d are independent, R 
is diagonal and its diagonal entries are the squares of the standard deviation ( "error size" ) of the 
error of the corresponding entries of the data vector d. The value Hx is what the value of the data 
would be for the state x in the absence of data errors. Then the probability density p(d|x) of the 
the data d conditional of the system state x, called the data likelihood, is 

p(d|x) ocexp (—(d-HxfR^id-Hx 

The pdf of the state and the data likelihood are combined to give the new probability density 
of the system state x conditional on the value of the data d (the posterior) by the Bayes theorem, 

p (x|d) oc p (d|x) p(x). 

The data d is fixed once it is received, so denote the posterior state by x instead of x|d and the 
posterior pdf by p (x) . It can be shown by algebraic manipulations [lj that the posterior pdf is also 
Gaussian, 

p (x) oc exp ~ A) T( _1 (x - A) 

with the posterior mean p, and covariance Q given by the Kalman update formulas 

A = n + K (d - Hp) , Q = (I- KB) Q, 

where 

K = QH T (HQH T + R) 1 
is the so-called Kalman gain matrix. 



2.2 The Ensemble Kalman Filter 

The EnKF is a Monte Carlo approximation of the Kalman filter, which avoids evolving the 
covariance matrix of the pdf of the state vector x. Instead, the distribution is represented by 
a collection of realizations, called an ensemble. So, let 

X = [xi, . . . ,xjv] = [Xj] 
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be an n x N matrix whose columns are a sample from the prior distribution. The matrix X is 
called the prior ensemble. Replicate the data d into anmxJV matrix 

D = [d lf ..., d N ] = [di] 

so that each column dj consists of the data vector d plus a random vector from the n-dimensional 
normal distribution N(0,R). Then the columns of 

X = X + K{D - HX) 

form a random sample from the posterior distribution. The EnKF is now obtained [T7j simply by 
replacing the state covariance Q in Kalman gain matrix K = QH^ i^HQH^ + i?) by the sample 
covariance C computed from the ensemble members (called the ensemble covariance). 

3 Theoretical analysis 

It is commonly stated that the ensemble is a sample (that is, independent identically distributed 
random variables, i.i.d.) and its probability distribution is represented by the mean and covariance, 
thus assuming that the ensemble is normally distributed. Although the resulting analyses, e.g., [7], 
played an important role in the development of EnKF, both statements are false. The ensemble 
covariance is computed from all ensemble members together, which introduces dependence, and 
the EnKF formula is a nonlinear function of the ensemble, which destroys the normality of the 
ensemble distribution. For a mathematical proof of the convergence of the EnKF in the limit for 
large ensembles to the Kalman filer, see |23j . The core of the analysis is a proof that large ensembles 
in the EnKF are, in fact, nearly i.i.d. and nearly normal. 

4 Implementation 

4.1 Basic formulation 

Here we follow [lOl US]- Suppose the ensemble matrix X and the data matrix D are as above. 
The ensemble mean and the covariance are 

1 JL A A T 

E (x) = -y^ k , c=4^—, 

y > TV ^ N - 1 

fc=i 

where 

A = X-E{X)=X-^ (Xejvxi) e lxN , 

and e denotes the matrix of all ones of the indicated size. 
The posterior ensemble X v is then given by 

X « X p = X + CH T (HCH T + R)' 1 (D - HX), 

where the perturbed data matrix D is as above. It can be shown that the posterior ensemble 
consists of linear combinations of members of the prior ensemble. 
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Note that since R is a covariance matrix, it is always positive semidefinite and usually 
positive definite, so the inverse above exists and the formula can be implemented by the Choleski 
decomposition [19]. In [3 [10], R is replaced by the sample covariance DD T / (N - 1) and the inverse 
is replaced by a pseudoinverse, computed using the Singular Values Decomposition (SVD). 

Since these formulas are matrix operations with dominant Level 3 operations p3], they are 
suitable for efficient implementation using software packages such as LAPACK (on serial and 
shared memory computers) and ScaLAPACK (on distributed memory computers) [19]. Instead 
of computing the inverse of a matrix and multiplying by it, it is much better (several times 
cheaper and also more accurate) to compute the Choleski decomposition of the matrix and treat 
the multiplication by the inverse as solution of a linear system with many simultaneous right-hand 
sides [T4] . 

4.2 Observation matrix-free implementation 

It is usually inconvenient to construct and operate with the matrix H explicitly; instead, a function 
h(x) of the form 

/i(x) = Fx, (1) 

is more natural to compute. The function h is called the observation function or, in the inverse 
problems context, the forward operator. The value of h(x) is what the value of the data would be 
for the state x assuming the measurement is exact. Then [19} [22] the posterior ensemble can be 
rewritten as 

X p = X + j^-^A (HA) T P- 1 (D - HX) 

where 

HA = HX-± ((HX) e Nxl ) e lxN , 

and 

P=-L-HA(HAf + R, 

with 

1 N 

[HA} i = H Xi -H-J2^ 
i=i 

1 N 

= h(x i )-j^J2h(x j ). 
j=l 

Consequently, the ensemble update can be computed by evaluating the observation function h on 
each ensemble member once and the matrix H does not need to be known explicitly. This formula 
also holds [19] for an observation function h(x.) = Hx + f with a fixed offset f , which also does 
not need to be known explicitly. The above formula has been commonly used for a nonlinear 
observation function h, such as the position of a hurricane vortex [8]. In that case, the observation 
function is essentially approximated by a linear function from its values at the ensemble members. 
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4.3 Implementation for a large number of data points 



For a large number m of data points, the multiplication by P~ 1 becomes a bottleneck. The following 
alternative formula |19[ [22] is advantageous when the number of data points m is large (such as 
when assimilating gridded or pixel data) and the data error covariance matrix R is diagonal (which 
is the case when the data errors are uncorrelated) , or cheap to decompose (such as banded due to 
limited covariance distance). Using the Sherman- Morrison- Woodbury formula [15] 



(R + UV 1 



R' 



R- X U(I + V 7 'R- 1 U)- 1 V' 1 R~ 



with 



gives 



U 



1 



N-l 



HA, V = HA, 



p- 1 =[R+ —^HA (HA) 



1 



N-l 



R- 



I- w ^(HA)(l + (HA) T R- 1 1 ^- [ (HA)] ' (HA) T R 



-i 



which requires only the solution of systems with the matrix R (assumed to be cheap) and of a 
system of size N with m right-hand sides. See [19] for operation counts. 



5 Further extensions 



The EnKF version described here involves randomization of data. For filters without randomization 
of data, see [21 dH [25]. 

Since the ensemble covariance is rank-deficient (there are many more state variables, typically 
millions, than the ensemble members, typically less than a hundred), it has large terms for pairs of 
points that are spatially distant. Since in reality the values of physical fields at distant locations 
are not that much correlated, the covariance matrix is tapered off artificially based on the distance, 
which results in better approximation of the covariance for small ensembles [13] , such as typically 
used in practice. Further development of this idea gives rise to localized EnKF algorithms [3[ I24|. 

For problems with coherent features, such as firelines, squall lines, and rain fronts, there is a need 
to adjust the simulation state by distorting the state in space as well as by an additive correction 
to the state. The morphing EnKF [20] employs intermediate states, obtained by techniques 
borrowed from image registration and morphing, instead of linear combinations of states. 

EnKFs rely on the Gaussian assumption, though they are of course used in practice for nonlinear 
problems, where the Gaussian assumption is not satisfied. Related filters attempting to relax the 
Gaussian assumption in EnKF while preserving its advantages include filters that fit the state pdf 
with multiple Gaussian kernels [4j, filters that approximate the state pdf by Gaussian mixtures [6], 
a variant of the particle filter with computation of particle weights by density estimation |20[ |2T] , 
and a variant of the particle filter with thick tailed pdfs to alleviate particle filter degeneracy [26J. 



5 



References 

[1] B. D. O. Anderson and J. B. Moore, Optimal filtering, Prentice-Hall, Englewood Cliffs, 
N.J., 1979. 

[2] J. L. Anderson, An ensemble adjustment Kalman filter for data assimilation, Monthly- 
Weather Review, 129 (1999), pp. 2884-2903. 

[3] , A local least squares framework for ensemble filtering, Monthly Weather Review, 131 

(2003), pp. 634-642. 

[4] J. L. Anderson and S. L. Anderson, A Monte Carlo implementation of the nonlinear 
filtering problem to produce ensemble assimilations and forecasts, Monthly Weather Review, 
127 (1999), pp. 2741-2758. 

[5] J. D. Beezley and J. Mandel, Morphing ensemble Kalman filters, Tellus, 60A (2008), 
pp. 131-140. 

[6] T. Bengtsson, C. Snyder, and D. Nychka, Toward a nonlinear ensemble filter for 
high dimensional systems, Journal of Geophysical Research - Atmospheres, 108(D24) (2003), 
pp. STS 2-1-10. 

[7] G. Burgers, P. J. van Leeuwen, and G. Evensen, Analysis scheme in the ensemble 
Kalman filter, Monthly Weather Review, 126 (1998), pp. 1719-1724. 

[8] Y. Chen and C. Snyder, Assimilating vortex position with an ensemble Kalman filter, 
Monthly Weather Review, 135 (2007), pp. 1828-1845. 

[9] G. Evensen, Sequential data assimilation with nonlinear quasi- geostrophic model using Monte 
Carlo methods to forecast error statistics, Journal of Geophysical Research, 99 (C5) (1994), 
pp. 143-162. 

[10] G. Evensen, The ensemble Kalman filter: Theoretical formulation and practical 
implementation, Ocean Dynamics, 53 (2003), pp. 343-367. 

[11] , Sampling strategies and square root analysis schemes for the EnKF, Ocean Dynamics, 

54 (2004), pp. 539-560. 

[12] G. Evensen, Data assimilation: The ensemble Kalman filter, Springer, Berlin, 2007. 

[13] R. Furrer and T. Bengtsson, Estimation of high- dimensional prior and posterior 
covariance matrices in Kalman filter variants, J. Multivariate Anal., 98 (2007), pp. 227-255. 

[14] G. H. Golub and C. F. V. Loan, Matrix Computations, Johns Hopkins Univ. Press, 1989. 
Second Edition. 

[15] W. W. Hager, Updating the inverse of a matrix, SIAM Rev., 31 (1989), pp. 221-239. 

[16] P. Houtekamer AND H. L. Mitchell, Data assimilation using an ensemble Kalman filter 
technique, Monthly Weather Review, 126 (1998), pp. 796-811. 



6 



[17] C. J. Johns and J. Mandel, A two-stage ensemble Kalman filter for smooth data 
assimilation, Environmental and Ecological Statistics, 15 (2008), pp. 101-110. 

[18] R. E. Kalman, A new approach to linear filtering and prediction problems, Transactions of 
the ASME - Journal of Basic Engineering, Series D, 82 (1960), pp. 35-45. 

[19] J. Mandel, Efficient implementation of the ensemble Kalman filter. CCM Report 231, 
University of Colorado Denver, 2006. 

[20] J. Mandel and J. D. Beezley, Predictor- corrector and morphing ensemble fil- 
ters for the assimilation of sparse data into high dimensional nonlinear systems. 
CCM Report 239, University of Colorado at Denver and Health Sciences Center. 
http://www.math.cudenver.edu/ccm/reports/rep239.pdf, November 2006. 11th Symposium 
on Integrated Observing and Assimilation Systems for the Atmosphere, Oceans, and Land 
Surface (IOAS-AOLS), CD-ROM, Paper 4.12, 87th American Meterological Society Annual 
Meeting, San Antonio, TX, January 2007, http://www.ametsoc.org. 

[21] , An ensemble Kalman-particle predictor- corrector filter for non-gaussian data 

assimilation. ICCS 2009, Lecture Notes in Computer Science, Springer, to appear, 2009. 
arXiv:0812.2290, 2008. 

[22] J. Mandel, J. D. Beezley, J. L. Coen, and M. Kim, Data assimilation for wildland fires: 
Ensemble Kalman filters in coupled atmosphere- surface models. arXiv:0712.3965, 2007. 

[23] J. Mandel, L. Cobb, and J. D. Beezley, On the convergence of the ensemble Kalman 
filter. arxiv:0901.2951, 2009. 

[24] E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, 
E. Kalnay, D. Patil, and J. A. Yorke, A local ensemble Kalman filter for atmospheric 
data assimilation, Tellus, 56A (2004), pp. 415-428. 

[25] M. K. Tippett, J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, 
Ensemble square root filters, Monthly Weather Review, 131 (2003), pp. 1485-1490. 

[26] P. van Leeuwen, A variance-minimizing filter for large-scale applications, Monthly Weather 
Review, 131 (2003), pp. 2071-2084. 



7 



